2 research outputs found

How Does BERT Answer Questions? A Layer-Wise Analysis of Transformer Representations

Author: Andrew
Belinkov Yonatan
Comon Pierre
Conneau Alexis
Dehghani Mostafa
Devlin Jacob
Dosilovic F. K.
Hupkes Dieuwke
Jain Sarthak
Liu Nelson F.
Mikolov Tomas
Nagamine Tasha
Questions
Seo Min Joon
Tenney Ian
van der Maaten Laurens
Vaswani Ashish
Voorhees Ellen
Weston Jason
Yang Zhilin
Zadeh Lotfi A
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/09/2019

Bidirectional Encoder Representations from Transformers (BERT) reach state-of-the-art results in a variety of Natural Language Processing tasks. However, understanding of their internal functioning is still insufficient and unsatisfactory. In order to better understand BERT and other Transformer-based models, we present a layer-wise analysis of BERT's hidden states. Unlike previous research, which mainly focuses on explaining Transformer models by their attention weights, we argue that hidden states contain equally valuable information. Specifically, our analysis focuses on models fine-tuned on the task of Question Answering (QA) as an example of a complex downstream task. We inspect how QA models transform token vectors in order to find the correct answer. To this end, we apply a set of general and QA-specific probing tasks that reveal the information stored in each representation layer. Our qualitative analysis of hidden state visualizations provides additional insights into BERT's reasoning process. Our results show that the transformations within BERT go through phases that are related to traditional pipeline tasks. The system can therefore implicitly incorporate task-specific information into its token representations. Furthermore, our analysis reveals that fine-tuning has little impact on the models' semantic abilities and that prediction errors can be recognized in the vector representations of even early layers.
Comment: Accepted at CIKM 201

arXiv.org e-Print Archive

Crossref

Analysis Methods in Neural Language Processing: A Survey

Author: Adi Yossi
Aharoni Roee
Ahmad Wasi Uddin
Alishahi Afra
Alvarez-Melis David
Alzantot Moustafa
Arras Leila
Artetxe Mikel
Aubakirova Malika
Bahdanau Dzmitry
Bau Anthony
Bawden Rachel
Belinkov Yonatan
Bernardy Jean-Philippe
Bisazza Arianna
Blevins Terra
Bodén Mikael
Bowman Samuel R.
Bruni Elia
Brunner Gino
Burchardt Aljoscha
Burlot Franck
Cer Daniel
Chaabouni Rahma
Chalup Stephan K.
Chang Jonathan
Chen Hongge
Chen Xinchi
Chen Yining
Cheng Minhao
Chrupała Grzegorz
Conneau Alexis
Cífka Ondřej
Dalvi Fahim
Das Sreerupa
Dasgupta Ishita
Dharmaretnam Dhanush
Ding Yanzhuo
Doshi-Velez Finale
Drexler Jennifer
Ebrahimi Javid
Elkahky Ali
Elloumi Zied
Elman Jeffrey L.
Ettinger Allyson
Faruqui Manaal
Feng Shi
Finkelstein Lev
Frank Robert
Freeman Cynthia
Fyshe Alona
Gaddy David
Ganesh J.
Gelderloos Lieke
Gers Felix A.
Gerz Daniela
Ghader Hamidreza
Ghaeini Reza
Giulianelli Mario
Glockner Max
Godin Fréderic
Goldberg Yoav
Gonzales Annette Rios
Goodfellow Ian
Goodfellow Ian J.
Gulordava Kristina
Gupta Abhijeet
Gupta Pankaj
Gururangan Suchin
Harris Catherine L.
Harwath David
Heigold Georg
Hupkes Dieuwke
Isabelle Pierre
Isahara Hitoshi
Iyyer Mohit
Jacovi Alon
James Murdoch W.
Ji Gao
Jia Robin
Jozefowicz Rafal
Karpathy Andrej
Ke Tran
Khandelwal Urvashi
King Margaret
Koh Sungryong
Köhn Arne
Lake Brenden
Lehmann Sabine
Lei Tao
Leviant Ira
Li Jiwei
Liang Bin
Lipton Zachary C.
Liu Nelson F.
Liu Yanpei
Luong Thang
Maillard Jean
Marelli Marco
Miikkulainen Risto
Mikolov Tomáš
Ming Yao
Montavon Grégoire
Mudrakarta Pramod Kaushik
Mullenbach James
Murphy Brian
Nagamine Tasha
Naik Aakanksha
Narodytska Nina
Niklasson Lars
Niu Tong
Papernot Nicolas
Park Dong Huk
Park Sungjoon
Peters Matthew
Poliak Adam
Pollack Jordan B.
Qian Peng
Ribeiro Marco Tulio
Rikters Matīss
Rocktäschel Tim
Rozsa Andras
Rudinger Rachel
Rush Alexander M.
Sakaguchi Keisuke
Samanta Suranjana
Sanchez Ivan
Sato Motoki
Senel Lutfi Kerem
Sennrich Rico
Shi Haoyue
Shi Xing
Singh Chandan
Strobelt Hendrik
Sundararajan Mukund
Sutskever Ilya
Suzgun Mirac
Szegedy Christian
Tang Gongbo
Thomas McCoy R.
Unanue Inigo Jauregi
Vanmassenhove Eva
Veldhoen Sara
Voita Elena
Vylomova Ekaterina
Wang Alex
Wang Shuai
Wang Xinyi
Wang Yu-Hsuan
Weiss Gail
Yang Puyudi
Ye Zhang
Yi Tay
Yuan Xiaoyong
Zaidan Omar
Zhang Quan-shi
Zhao Jieyu
Zhao Junbo
Zhao Zhengli
Zhizheng Wu
Publication venue: 'MIT Press - Journals'

Crossref